4 research outputs found

    Enhanced ontology-based text classification algorithm for structurally organized documents

    Text classification (TC) is an important foundation of information retrieval and text mining. The main task of TC is to predict a text's class according to a set of tags given in advance. Most TC algorithms represent a document by its terms and do not consider the relations among those terms. These algorithms represent documents in a space where every word is assumed to be a dimension. As a result, such representations have high dimensionality, which has a negative effect on classification performance. The objectives of this thesis are to formulate algorithms for classifying text by creating a suitable feature vector and reducing the dimension of the data, which will enhance classification accuracy. This research combines ontology and text representation for classification by developing five algorithms. The first and second algorithms, namely Concept Feature Vector (CFV) and Structure Feature Vector (SFV), create feature vectors to represent the document. The third algorithm, Ontology Based Text Classification (OBTC), is designed to reduce the dimensionality of training sets. The fourth and fifth algorithms, Concept Feature Vector_Text Classification (CFV_TC) and Structure Feature Vector_Text Classification (SFV_TC), classify the document into its related set of classes. The proposed algorithms were tested on five different scientific paper datasets downloaded from different digital libraries and repositories. Experimental results obtained from the proposed algorithms, CFV_TC and SFV_TC, show better average results in terms of precision, recall, f-measure and accuracy compared with SVM and RSS approaches. The work in this study contributes to exploring related documents in information retrieval and text mining research by using ontology in TC.
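The dimensionality argument in this abstract can be illustrated with a small sketch. It is not the thesis's CFV algorithm: the term-to-concept table below is a hypothetical toy ontology, and dropping unmapped terms is an assumption made only to keep the example short.

```python
from collections import Counter

# Hypothetical toy ontology: maps surface terms to a shared concept label.
# Illustration only; not taken from the thesis.
TERM_TO_CONCEPT = {
    "svm": "classifier",
    "perceptron": "classifier",
    "bayes": "classifier",
    "corpus": "dataset",
    "dataset": "dataset",
    "benchmark": "dataset",
}


def term_vector(tokens):
    """Bag of words: one dimension per distinct term."""
    return Counter(tokens)


def concept_vector(tokens):
    """One dimension per ontology concept; unmapped terms are dropped (assumption)."""
    return Counter(TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT)


doc = "svm perceptron bayes corpus benchmark dataset svm".split()
terms = term_vector(doc)
concepts = concept_vector(doc)
print(len(terms), "term dimensions,", len(concepts), "concept dimensions")
```

Terms that share a concept collapse into one dimension, which is the kind of reduction the abstract attributes to combining an ontology with the text representation.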

    Mitigation of packet loss with end-to-end delay in wireless body area network applications

    The wireless body area network (WBAN) has been proposed as a solution to the problems of population ageing, shortage of medical facilities and various chronic diseases. The development of this technology has been further fueled by the demand for real-time applications that monitor these cases over networks. The integrity of communication is constrained by the loss of packets during transmission, which affects the reliability of the WBAN. Mitigating packet loss while maintaining network performance is a challenging task that has sparked numerous studies over the years. WBAN technology also has a problem of reduced network lifetime; thus, in this paper, we utilize a cooperative routing protocol (CRP) to improve packet delivery with respect to end-to-end latency and to extend the network lifetime. End-to-end latency was used as a metric to determine the significance of CRP among WBAN routing protocols. The CRP increased the rate of packet transmission to the sink and mitigated packet loss. The proposed solution shows that the end-to-end delay in the WBAN is considerably reduced by applying the cooperative routing protocol. The CRP technique attained a delivery ratio of 0.8176 compared to 0.8118 when transmitting packets in the WBAN.
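The two delivery ratios quoted above can be put in perspective with a short calculation. The abstract does not define the metric, so this sketch assumes the usual definition (packets delivered to the sink divided by packets sent), and it treats 0.8118 as an unnamed baseline because the abstract does not say which protocol produced it.

```python
def delivery_ratio(delivered: int, sent: int) -> float:
    """Packet delivery ratio, assuming the usual definition: delivered / sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return delivered / sent


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


# Figures quoted in the abstract; the baseline protocol is not named there.
crp_ratio, baseline_ratio = 0.8176, 0.8118
gain = relative_gain(crp_ratio, baseline_ratio)
print(f"absolute difference: {crp_ratio - baseline_ratio:.4f}")
print(f"relative gain: {gain:.2%}")
```

Under these assumptions the difference between the two ratios is under one percent in relative terms.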

    Designing and configuring context-aware semantic web applications

    Context-aware services are attracting worldwide attention as the use of web services grows rapidly. We designed a context-aware semantic web architecture which provides on-demand flexibility and scalability in extracting and mining research papers from well-known digital libraries, i.e. ACM, IEEE and SpringerLink. This paper proposes a context-aware services system which supports automatic discovery and integration of context based on Semantic Web services. The work was done in the Python programming language with a dedicated library for semantic web analysis named “Cubic-Web”, which can be applied to any defined dataset; in our case, we used a dataset of several publications, extracted and studied to measure the impact of the context-aware semantic web application on the results. We report the average recall and average accuracy for all the context-aware research journals in our work. Because this study is limited to journal documents, future studies can examine other types of publications using this approach. An efficient system has been designed that uses research article metadata parameters to find papers on the web with semantic web technology. These parameters include year of publication, type of publication, number of contributors, evaluation methods and the analysis method used in the publication. All of this data was extracted using the designed context-aware semantic web technology.

    Database techniques for resilient network monitoring and inspection

    Network connection logs have long been recognized as integral to proper network security, maintenance, and performance management. However, even a somewhat sizable network will generate large amounts of logs at very high rates. This paper draws on developments in distributed systems and write-optimized databases: it explains why many storage methods are insufficient for providing real-time analysis on sizable datasets and examines database techniques that attempt to address this challenge. We argue that sufficient methods include distributing storage and computation and using write-optimized data structures (WODs). Diventi, a project developed by Sandia National Laboratories, is used here to evaluate the potential of WODs to manage large datasets of network connection logs. It can ingest billions of connection logs at rates over 100,000 events per second while allowing most queries to complete in under one second. Storage and computation distribution are then evaluated using Elasticsearch, an open-source distributed search and analytics engine. Then, to provide an example application of these databases, we develop a simple analytic which collects statistical information and classifies IP addresses based upon behavior. Finally, we examine the results of running the proposed analytic in real time upon broconn (now Zeek) flow data collected by Diventi at IEEE/ACM Supercomputing 2019.
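A minimal sketch of what a "simple analytic" over connection logs might look like is given here. It is not the paper's analytic: the record layout (source IP, destination IP, destination port), the statistics collected, the thresholds and the labels are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class IpStats:
    """Per-source statistics; the chosen fields are an assumption."""
    connections: int = 0
    peers: set = field(default_factory=set)
    ports: set = field(default_factory=set)


def summarize(records):
    """Aggregate (src_ip, dst_ip, dst_port) tuples by source IP."""
    stats = defaultdict(IpStats)
    for src, dst, dport in records:
        entry = stats[src]
        entry.connections += 1
        entry.peers.add(dst)
        entry.ports.add(dport)
    return stats


def classify(entry, port_threshold=100, peer_threshold=100):
    """Label a source by behavior; thresholds and labels are hypothetical."""
    if len(entry.ports) >= port_threshold:
        return "many-ports"
    if len(entry.peers) >= peer_threshold:
        return "many-peers"
    return "ordinary"


# Synthetic records: one source touches 150 ports, another repeats one port.
records = [("10.0.0.1", "10.0.0.9", port) for port in range(1, 151)]
records += [("10.0.0.2", "10.0.0.9", 443)] * 3
labels = {ip: classify(entry) for ip, entry in summarize(records).items()}
print(labels)
```

Keeping only counters and small sets per address lets the statistics be updated one record at a time, which suits a stream of logs arriving in real time.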